feat(core)!: more explicit handling of case-sensitivity in dictionaries #2630
base: master
`SpellCheck` shouldn't handle capitalization if `OrthographicConsistency` is going to do it anyway.
- Splits `WordId` into `CanonicalWordId` and `CaseFoldedWordId`.
- Updates dictionary functions to handle casing more explicitly. There are now functions to get a specific word case-sensitively, to get multiple words case-insensitively, and to do the latter while merging all metadata (which was the old behavior).
- Fixes an issue where `SpellCheck` would sometimes mark words as incorrect if an identical entry with different casing existed in the dictionary (e.g. OS, PR, etc.).
- Makes `SpellCheck` no longer care about casing, since that is handled by `OrthographicConsistency`.
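A minimal sketch of the idea behind the two ID types and the two lookup styles, not the PR's actual code. The type names `CanonicalWordId` and `CaseFoldedWordId` come from the description; everything else (`ToyDictionary`, `hash_str`, `from_word`, `get_exact`, `get_case_insensitive`) is hypothetical, `DefaultHasher` stands in for whatever hasher the crate uses, and the folded ID here only lowercases, skipping any further normalization.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Identifies a word exactly as written (illustrative stand-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CanonicalWordId(u64);

/// Identifies a word after case folding (illustrative stand-in).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct CaseFoldedWordId(u64);

/// Hypothetical helper: hash a string with the standard library hasher.
fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

impl CanonicalWordId {
    /// Hash the word as is, with no lowercasing.
    fn from_word(word: &str) -> Self {
        Self(hash_str(word))
    }
}

impl CaseFoldedWordId {
    /// Lowercase the word before hashing (assumption: no other normalization).
    fn from_word(word: &str) -> Self {
        Self(hash_str(&word.to_lowercase()))
    }
}

/// Toy dictionary indexed both ways (hypothetical, for illustration only).
#[derive(Default)]
struct ToyDictionary {
    exact: HashMap<CanonicalWordId, String>,
    folded: HashMap<CaseFoldedWordId, Vec<String>>,
}

impl ToyDictionary {
    fn insert(&mut self, word: &str) {
        self.exact
            .insert(CanonicalWordId::from_word(word), word.to_string());
        self.folded
            .entry(CaseFoldedWordId::from_word(word))
            .or_default()
            .push(word.to_string());
    }

    /// Case-sensitive lookup: at most one entry.
    fn get_exact(&self, word: &str) -> Option<&str> {
        self.exact
            .get(&CanonicalWordId::from_word(word))
            .map(String::as_str)
    }

    /// Case-insensitive lookup: every entry that differs only in casing.
    fn get_case_insensitive(&self, word: &str) -> &[String] {
        match self.folded.get(&CaseFoldedWordId::from_word(word)) {
            Some(entries) => entries.as_slice(),
            None => &[],
        }
    }
}
```

With both `os` and `OS` inserted, an exact lookup for `OS` finds that entry, an exact lookup for `Os` finds nothing, and a case-insensitive lookup for `Os` returns both entries, which is why the case-insensitive functions hand back a collection rather than a single word.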
…mattic#2476)" This partially reverts commit 5230d6a. Returns the word to the dictionary, since removing it should no longer be necessary.
Since casing-related issues are now handled by `OrthographicConsistency`, not `SpellCheck`.
Expands the criteria under which `OrthographicConsistency` will lint for incorrect capitalization. Makes the `no_improper_suggestion_for_macos` test pass.
Allow every casing/orthography of a word that is defined in the dictionary. If the dictionary contains the exact word, `OrthographicConsistency` will skip it.
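The skip rule could be sketched roughly as follows. This is an illustration under stated assumptions, not the linter's real logic: `casing_suggestions` is a hypothetical helper, the dictionary is modeled as a plain slice of strings, and case folding is reduced to `to_lowercase`.

```rust
/// Hypothetical helper: returns the dictionary casings to suggest for `word`.
/// An empty result means there is nothing to lint for casing.
fn casing_suggestions<'a>(dictionary: &[&'a str], word: &str) -> Vec<&'a str> {
    // The exact casing is defined in the dictionary, so the word is skipped.
    if dictionary.iter().any(|entry| *entry == word) {
        return Vec::new();
    }

    // Otherwise, collect every entry that differs from the word only in casing.
    let folded = word.to_lowercase();
    dictionary
        .iter()
        .copied()
        .filter(|entry| entry.to_lowercase() == folded)
        .collect()
}
```

For a dictionary containing `macOS`, `OS`, and `os`, the word `OS` yields no suggestions because it is present exactly, while `MacOS` yields `macOS`. A word with no case-insensitive match also yields nothing here, since unknown words are a spelling matter rather than a casing one.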
Both variants are defined in the dictionary, and appear to be valid in this case.
The test would expect 'Al' to be linted by `OrthographicConsistency` for not being all-caps.
Remove needless borrow.
Does it handle the same word having multiple This happens a lot more than expected and causes some bugs of this type to crop up. I had a PR for it for many months that just tracked them but didn't really do anything with them as I wasn't sure what was and wasn't planned etc.
I don't recall everything perfectly at the moment, but I think both cases are possible and only one is handled:
I believe you're talking about 1., which works but is behind a few bugs. But 2. is distinct. I found my old PR: #1035. Apologies if I've mixed up any concepts myself - it's been a while (-:
Oh, I see what you mean now. Yeah no, at the moment I think this PR mostly sticks with the old behavior where once a I did change the field to store a I'll take another look though. I wasn't too certain with that change as is, and the PR you mentioned does bring up an issue I haven't even considered.

Issues
Fixes #2585
Related to #1688
Related to #2411
(Probably a few more, this has been a long-running issue.)
Description
- `WordId` with `CanonicalWordId` and `CaseFoldedWordId`.
  - `CanonicalWordId` hashes the input word as is, without lowercasing or normalization.
  - `CaseFoldedWordId` lowercases and normalizes the word before hashing it (this is the current behavior of `WordId`).
- `WordIdPair` and `EitherWordId` to make it easier to work with `CanonicalWordId`/`CaseFoldedWordId`.
- `Vec` of words, not an individual word.
- `get_word_metadata_combined` functions.
- `SpellCheck` despite being present in the dictionary.
- `SpellCheck` no longer care about capitalization, fully transferring that responsibility to `OrthographicConsistency` (which already handles it anyway for the most part).
How Has This Been Tested?
`cargo test`
Checklist